ARC-46: Add end of the dynamic tuple to the head value #212

Closed
wants to merge 4 commits into from

Conversation

joe-p
Contributor

@joe-p joe-p commented Jun 15, 2023

Change to ABI dynamic tuple encoding to allow for more efficient reading of complex elements

@joe-p joe-p changed the title Add end of the dynamic tuples to the head value ARCXXXX: Add end of the dynamic tuples to the head value Jun 15, 2023
@joe-p joe-p changed the title ARCXXXX: Add end of the dynamic tuples to the head value ARC-XXXX: Add end of the dynamic tuples to the head value Jun 15, 2023
@joe-p joe-p changed the title ARC-XXXX: Add end of the dynamic tuples to the head value ARC-XXXX: Add end of the dynamic tuple to the head value Jun 16, 2023
@SudoWeezy SudoWeezy changed the title ARC-XXXX: Add end of the dynamic tuple to the head value ARC-39: Add end of the dynamic tuple to the head value Jun 16, 2023
ARCs/arc-0039.md Outdated
According to [ARC-0004](./arc-0004.md), when encoding dynamic types in a tuple the tuple consists of a `head` containing all of the offsets of the actual values in the `tail`. An additional value, the byte offset of the end of the tuple, should be added to the end of the `head` value for more efficient element reading.

## Motivation
When reading a value in a dynamic tuple, the head offset of the element must be extracted as a `uint16` to get the start of the element. To get the end of the element, which is necessary to properly read the element, the following head value can be extracted. This, however, fails to work when the element being read is the last element in its respective dynamic array or tuple. When the element is the last in the array or tuple, one must compute the length of the element. For static types, this is trivial and for dynamic arrays of static types this is also trivial because the length prefix can be used. For more complex nested dynamic types, however, this could involve many levels of extracting lengths and offsets, thus reading an element is `O(N)`, with `N` being the depth of the nested dynamic types. If there were an additional head value containing the end of the dynamic tuple, reading any element, regardless of its type or position, would be `O(1)` since one can always just extract the following head value.
Contributor

> which is necessary to properly read the element

I don't see why this is true. Grab the suffix of the input, beginning at the start of the value. The value itself contains everything you need to read the right amount, starting there. There's no particular reason to worry, ahead of time, where the value ends, so just pass the entire suffix to whatever needs to decode it.

Contributor Author

If the value is dynamic it doesn't contain "everything you need". If the value is a dynamic array the length prefix will tell you how many elements are in the array but not the length of those elements. You need to read the last value of the array and get the length for that. And if that value is a dynamic array, you must do the same thing.

Contributor

Even if the elements are dynamic, their heads are statically sized, and laid out first. You know how many there are from the length. And, if they are dynamic, they will have pointers to their dynamically sized tails.

Contributor Author

But in the case of nested dynamic types you have to continuously extract the head and calculate length. If we knew the end of the tuple it'd be much easier to use that to extract/substring.

To be clear, the desired goal here is to be able to efficiently extract a single value from an encoded nested dynamic type.

ARCs/arc-0039.md Outdated
Comment on lines 47 to 49
This ARC: `0x0002 0006 000d 0014 0005 48656c6c6f 0005 576f726c64`

Note the addition of the `0014` signifying the end of the tuple/array and `head(x[1])` and `head(x[2])` being incremented by two to account for this additional value.
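
A minimal Python sketch of the two encodings for `["Hello", "World"]` as `string[]`, assuming ARC-4 style big-endian `uint16` length prefixes and head offsets measured from the start of the head section. The helper names (`encode_string`, `encode_string_array`) and the `with_end` flag are hypothetical, chosen only for illustration.

```python
def encode_string(s: str) -> bytes:
    """Encode a string as a uint16 byte length followed by UTF-8 bytes."""
    data = s.encode("utf-8")
    return len(data).to_bytes(2, "big") + data


def encode_string_array(items, with_end=False) -> bytes:
    """Encode string[]; with_end=True appends the proposed end-of-tuple head."""
    tails = [encode_string(s) for s in items]
    # One 2-byte head per element, plus one more if the end offset is added.
    head_len = 2 * (len(items) + (1 if with_end else 0))
    heads = []
    offset = head_len
    for tail in tails:
        heads.append(offset.to_bytes(2, "big"))
        offset += len(tail)
    if with_end:
        # After the loop, offset points one past the last tail byte.
        heads.append(offset.to_bytes(2, "big"))
    prefix = len(items).to_bytes(2, "big")
    return prefix + b"".join(heads) + b"".join(tails)


arc4 = encode_string_array(["Hello", "World"])
proposed = encode_string_array(["Hello", "World"], with_end=True)
print(arc4.hex())
print(proposed.hex())
```

Under these assumptions the second line printed matches the hex shown for this ARC, with every existing offset shifted by two bytes and `0014` appended to the head.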
Contributor

If I understand correctly, this is not a great example because 0014 points to the end of the encoding, and that's not very useful here because it's just the length of the byte string.

Can you provide an example where the benefit of including this additional information is clearer?

Contributor Author

Yeah this example was chosen to demonstrate the encoding rules, but it doesn't really show the rationale. Take a slightly more complex example:

[["Hello", "World"], [1, 2, 3]] encoded as (string[],uint8[])

0004 0018 0002 0004 000b 0005 48656c6c6f 0005 576f726c64 0003 01 02 03

To get the end of x[0][1] you will always need to check if x[0][1] is the last element of x[0]. If it's not, you just extract the next head but if it is you need to go up a level and get the head for x[1].

In this example, it exists, but if there were further levels of nesting you would need to check whether the parent element is the last element in its parent and repeat until the next head is found. This means you need to spend opcodes checking the accessor against the length prefix for every dynamic parent until you find the first parent tuple that has a dynamic element following the accessed element.

In the context of TEAL compilers, the compiler has no way of knowing whether the accessed element is last or not without using up those opcodes, unless only literal array access is supported.
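
The last-element check described above can be sketched in Python against the example bytes. This is only an illustration of the branching for the `(string[],uint8[])` case, not TEAL; `u16` and `read_x0_element` are hypothetical helper names, and the fallback to `head(x[1])` is hard-coded for this one shape.

```python
ENCODED = bytes.fromhex(
    "0004 0018 0002 0004 000b 0005 48656c6c6f 0005 576f726c64 0003 01 02 03"
)


def u16(buf: bytes, pos: int) -> int:
    """Read a big-endian uint16 at byte position pos."""
    return int.from_bytes(buf[pos:pos + 2], "big")


def read_x0_element(buf: bytes, j: int) -> bytes:
    """Return the encoded bytes of x[0][j] for a (string[],uint8[]) tuple."""
    x0_start = u16(buf, 0)          # head(x[0]) in the outer tuple
    count = u16(buf, x0_start)      # length prefix of x[0]
    heads = x0_start + 2            # inner offsets are relative to this point
    start = heads + u16(buf, heads + 2 * j)
    if j + 1 < count:
        # Not the last element: the next inner head marks the end.
        end = heads + u16(buf, heads + 2 * (j + 1))
    else:
        # Last element: go up one level and use head(x[1]) instead.
        end = u16(buf, 2)
    return buf[start:end]


print(read_x0_element(ENCODED, 0)[2:])  # skip the 2-byte string length
print(read_x0_element(ENCODED, 1)[2:])
```

With deeper nesting, the `else` branch would itself need the same check one level up, which is the repeated cost being discussed.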

Contributor Author

Alternatively, you can use the length prefix to calculate the length of the accessed element, but if that element is a nested dynamic type then you run into similar complexity: you need to calculate the total length of the last element at each level.

Contributor

@jannotti jannotti Jun 23, 2023

Why are you trying to get to the end of x[0][1]?

If your goal is to start decoding x[1], you already know where to start, because the head of x[1] is in a well defined spot (bytes 2-3 of the input), and that tells you that x[1] starts at byte 0x18.

If you are not trying to reach x[1], then I suspect you are trying to substr out x[0] in order to give it to some decoding routine, so you'd like to know where it ends. But you don't have to do that. Just give that routine the entire suffix of the input, starting where head(x[0]) tells you to begin (byte 4).

You say it's hard "To get the end of x[0][1]" and I am saying that is not an operation you should need directly. If you are going down the dynamic path of x[0], you will find out where x[0][1] ends when you examine the head of x[0][1], combined with the length where head(x[0][1]) points. If you don't actually care about x[0][1] itself and are really just trying to get to the end of x[0], you know where x[1] begins.
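
The suffix-based approach described here can be sketched in Python. Each decoder is handed everything from the start of its value onward and reads only as far as the value's own length information says, so no end offset is computed up front. The function names are hypothetical, and the bytes are the `(string[],uint8[])` example from earlier in the thread.

```python
ENCODED = bytes.fromhex(
    "0004 0018 0002 0004 000b 0005 48656c6c6f 0005 576f726c64 0003 01 02 03"
)


def u16(buf: bytes, pos: int) -> int:
    """Read a big-endian uint16 at byte position pos."""
    return int.from_bytes(buf[pos:pos + 2], "big")


def decode_string_from_suffix(suffix: bytes) -> str:
    """Decode a string that starts at suffix[0]; trailing bytes are ignored."""
    n = u16(suffix, 0)
    return suffix[2:2 + n].decode("utf-8")


def decode_string_array_from_suffix(suffix: bytes) -> list:
    """Decode string[] that starts at suffix[0]; trailing bytes are ignored."""
    count = u16(suffix, 0)
    body = suffix[2:]  # element offsets are relative to the head section
    return [
        decode_string_from_suffix(body[u16(body, 2 * i):])
        for i in range(count)
    ]


x0_suffix = ENCODED[u16(ENCODED, 0):]  # begins at byte 4, runs to end of input
print(decode_string_array_from_suffix(x0_suffix))
```

The suffix passed in still contains the bytes of x[1] after x[0], and the decoders never look at them.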

Contributor Author

The goal is to get the value of x[0][1] as efficiently as possible, so get the start of x[0][1], the end of x[0][1] and use substring3.

> combined with the length where head(x[0][1])

Which is great when x[0][1] is of type T[] and T is static, but if T is dynamic then you need to spend opcode budget to calculate the total length of x[0][1]. For every level of dynamic nested types in T, it consumes opcodes to calculate the total length. Having a value that points to the end of x[0][1] in the encoded data structure makes it trivial to get the end, rather than potentially having to calculate all of the lengths.
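
As a rough illustration of that per-level cost, here is a Python sketch that computes the total encoded length of a `string[]` and of a `string[][]` by following the last head at each level. It assumes the usual layout where the last element's tail is the final bytes of the value; the function names are hypothetical and this is not TEAL.

```python
def u16(buf: bytes, pos: int) -> int:
    """Read a big-endian uint16 at byte position pos."""
    return int.from_bytes(buf[pos:pos + 2], "big")


def string_len(buf: bytes, start: int) -> int:
    """Total encoded length of a string: 2-byte prefix plus its bytes."""
    return 2 + u16(buf, start)


def string_array_len(buf: bytes, start: int) -> int:
    """Total encoded length of a string[] beginning at start."""
    count = u16(buf, start)
    if count == 0:
        return 2
    heads = start + 2
    last = heads + u16(buf, heads + 2 * (count - 1))  # start of last element
    return (last - start) + string_len(buf, last)


def string_matrix_len(buf: bytes, start: int) -> int:
    """Total encoded length of a string[][]: one more level of the same walk."""
    count = u16(buf, start)
    if count == 0:
        return 2
    heads = start + 2
    last = heads + u16(buf, heads + 2 * (count - 1))
    return (last - start) + string_array_len(buf, last)


# [["Hello", "World"]] as string[][], followed by unrelated trailing bytes.
matrix = bytes.fromhex("0001 0002 0002 0004 000b 0005 48656c6c6f 0005 576f726c64")
buf = matrix + b"\xff\xff"
print(string_matrix_len(buf, 0))
```

Each extra level of dynamic nesting adds another function of the same shape, whereas an end offset stored in the head would be a single `uint16` read.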

Contributor

@jannotti jannotti Jun 23, 2023

I explained why substr'ing out the value is not necessary.

> I suspect you are trying to substr out x[0] in order to give it to some decoding routine, so you'd like to know where it ends. But you don't have to do that. Just give that routine the entire suffix of the input, starting where head(x[0]) tells you to begin (byte 4).

Surely it's even fewer opcodes to grab the suffix than to obtain the end (in any manner, whether by calculation or by pulling out the phantom length you want to add).

@SudoWeezy
Collaborator

On hold, for the time being, will move forward if more people request it.

@joe-p
Contributor Author

joe-p commented Jun 23, 2023

To give some more context: after the public ARC meeting on Discord, we feel this could still be useful for complex data structures, but we are not sure how common complex data structures are. If we find developers want to work with complex data structures, this can be revisited.

@SudoWeezy SudoWeezy changed the title ARC-39: Add end of the dynamic tuple to the head value ARC-45: Add end of the dynamic tuple to the head value Jul 10, 2023
@SudoWeezy SudoWeezy changed the title ARC-45: Add end of the dynamic tuple to the head value ARC-46: Add end of the dynamic tuple to the head value Jul 10, 2023
@joe-p joe-p mentioned this pull request Aug 8, 2023
@joe-p joe-p closed this Aug 8, 2023
4 participants